A New Textual/Non-Textual Classifier for Document Skew Correction

Authors

Xiaoyan Zhu

Xiaoxin Yin

Abstract

This work is supported by the National Natural Science Foundation of China (69982005) and Projects of Development Plan of the State Key Foundation Search (G199803050703).

A robust approach is proposed for document skew detection. We use Fourier analysis and an SVM to separate textual areas from non-textual areas of documents. We also propose a robust method to determine the skew angle from the textual areas. Our approach achieves good performance on documents with large areas of non-textual content.
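The abstract does not say which Fourier features feed the SVM. As a purely illustrative sketch, one plausible feature is the periodicity of a block's row projection profile: regularly spaced text lines concentrate spectral energy at one frequency. The function names and the choice of feature below are assumptions, not the authors' method.

```python
import cmath
import math


def row_profile(block):
    """Count dark pixels (value 1) in each row of a binary block."""
    return [sum(row) for row in block]


def dominant_period_feature(profile):
    """Return (period_in_rows, energy_share) of the strongest non-DC frequency.

    energy_share is the fraction of non-DC spectral energy at that frequency;
    a flat profile has no such energy and yields (None, 0.0).
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [p - mean for p in profile]  # remove the DC component
    energies = {}
    for k in range(1, n // 2 + 1):
        # Naive DFT coefficient at frequency index k (hypothetical helper logic).
        coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                    for t, c in enumerate(centered))
        energies[k] = abs(coeff) ** 2
    total = sum(energies.values())
    if total == 0:
        return None, 0.0
    best = max(energies, key=energies.get)
    return n / best, energies[best] / total
```

A block of text lines repeating every 8 rows gives a period of 8 with most of the energy at that frequency, while a solid region gives no periodic energy at all. Values like these could then serve as inputs to an SVM, though the paper's actual feature set may differ.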

Similar resources

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-textual Document Images: A Novel Approach

Skew angle estimation and correction of a document page is an important task for document analysis and optical character recognition (OCR) applications. Many skew detection approaches can process purely textual document images successfully, but documents that are handwritten or contain large areas of non-textual content remain challenging. In this direction, a novel approach for textual...

Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents

Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents' contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...

Skew Correction of Textural Documents

Two algorithms for accurate skew detection and correction of textual documents are presented. They depend on finding a horizontal RLSA image of the skewed document. The average skew of selected black connected components in the RLSA image is considered as the skew angle for the whole document which is finally rotated in the opposite direction by that amount to obtain the final corrected image. ...
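The horizontal RLSA (run-length smoothing) step mentioned above can be sketched as follows. This is a minimal illustration assuming a binary image stored as lists of 0 (white) and 1 (black); the function name and the "gap length at most `threshold`" convention are assumptions, and implementations differ on whether the comparison is strict.

```python
def horizontal_rlsa(image, threshold):
    """Fill white runs of length <= threshold that lie between two black
    pixels in the same row. Returns a new image; the input is not modified."""
    out = [row[:] for row in image]
    for row in out:
        last_black = None  # column of the most recent black pixel in this row
        for x, pixel in enumerate(row):
            if pixel == 1:
                gap = x - last_black - 1 if last_black is not None else 0
                if 0 < gap <= threshold:
                    # Merge the two black pixels by blackening the gap.
                    for fill in range(last_black + 1, x):
                        row[fill] = 1
                last_black = x
    return out
```

With a suitable threshold, characters on the same text line merge into long connected components whose orientation can then be measured, which is the role the smoothed image plays in the description above.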

Fiducial line based skew estimation

Skew estimation for textual document images is a well-researched topic, and numerous methods have been reported in the literature. One of the major challenges is the presence of interfering non-textual objects of various types and quantities in the document images. Many existing methods require proper separation of the textual objects, which are well aligned, from the non-textual objects which ...

Publication date: 2002
